An a Priori Exponential Tail Bound for k-Folds Cross-Validation

Authors

Karim T. Abou-Moustafa

Csaba Szepesvári

Abstract

We consider a priori generalization bounds developed in terms of cross-validation estimates and the stability of learners. In particular, we first derive an exponential Efron-Stein type tail inequality for the concentration of a general function of n independent random variables. Next, under some reasonable notion of stability, we use this exponential tail bound to analyze the concentration of the k-fold cross-validation (KFCV) estimate around the true risk of a hypothesis generated by a general learning rule. While the accumulated literature has often attributed this concentration to the bias and variance of the estimator, our bound attributes this concentration to the stability of the learning rule and the number of folds k. This insight raises valid concerns related to the practical use of KFCV, and suggests research directions to obtain reliable empirical estimates of the actual risk.
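For readers unfamiliar with the quantity the abstract analyzes, the following is a minimal sketch of a plain KFCV risk estimate: the data are split into k folds, the learning rule is trained on k-1 folds, and the held-out losses are averaged. It is not the authors' code. The names `kfcv_estimate`, `fit_ridge`, and `squared_loss`, the synthetic data, and the choice of ridge regression as the learning rule are all assumptions made for illustration.

```python
import numpy as np

def kfcv_estimate(X, y, k, fit, loss, seed=0):
    """Return the average of the k held-out fold losses, plus the per-fold values."""
    n = len(y)
    rng = np.random.default_rng(seed)
    # Shuffle the indices once, then cut them into k nearly equal folds.
    folds = np.array_split(rng.permutation(n), k)
    fold_losses = []
    for i, val_idx in enumerate(folds):
        # Train on every fold except the i-th one.
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        predict = fit(X[train_idx], y[train_idx])
        # Evaluate the resulting hypothesis on the held-out fold.
        fold_losses.append(float(np.mean(loss(predict(X[val_idx]), y[val_idx]))))
    return float(np.mean(fold_losses)), fold_losses

def fit_ridge(X, y, lam=1.0):
    """Hypothetical learning rule: ridge regression solved in closed form."""
    d = X.shape[1]
    w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    return lambda X_new: X_new @ w

def squared_loss(pred, target):
    return (pred - target) ** 2

# Synthetic data (an assumption, used only to make the sketch runnable).
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * data_rng.normal(size=200)

estimate, per_fold = kfcv_estimate(X, y, k=5, fit=fit_ridge, loss=squared_loss)
print(f"KFCV estimate: {estimate:.4f}")
```

The spread of `per_fold` gives a rough feel for how much the estimate depends on the split; the abstract's point is that this concentration is governed by the stability of the learning rule and by k.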

Similar resources

A full NT-step O(n) infeasible interior-point method for Cartesian P_*(k)-HLCP over symmetric cones using exponential convexity

In this paper, by using the exponential convexity property of a barrier function, we propose an infeasible interior-point method for the Cartesian P_*(k) horizontal linear complementarity problem over symmetric cones. The method uses Nesterov and Todd full steps, and we prove that the proposed algorithm is well defined. The iteration bound coincides with the currently best iteration bound for the Ca...

An Exponential Tail Bound for Lq Stable Learning Rules. Application to k-Folds Cross-Validation

Cross-Validation

Definition: Cross-validation is a statistical method of evaluating and comparing learning algorithms by dividing data into two segments: one used to learn or train a model and the other used to validate the model. In typical cross-validation, the training and validation sets must cross over in successive rounds such that each data point has a chance of being validated against. The basic form of ...

The lower bound for the number of 1-factors in generalized Petersen graphs

In this paper, we investigate the number of 1-factors of a generalized Petersen graph $P(N,k)$ and obtain a lower bound for the number of 1-factors of $P(N,k)$ when $k$ is odd, which shows that the number of 1-factors of $P(N,k)$ is exponential in this case and confirms a conjecture due to Lovász and Plummer (Ann. New York Acad. Sci. 576(2006), no. 1, 389-398).

Stability of cross-validation and minmax-optimal number of folds

In this paper, we analyze the properties of cross-validation from the perspective of stability, that is, the difference between the training error and the error of the selected model applied to any other finite sample. In both the i.i.d. and non-i.i.d. cases, we derive upper bounds on the one-round and average test error, referred to as the one-round/convoluted Rademacher-bounds, to qua...

Journal title:

CoRR

Volume abs/1706.05801

Publication date: 2017
